{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Pipeline Tutorial with Hetero Components"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### install"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`Pipeline` is distributed along with [fate_client](https://pypi.org/project/fate-client/).\n",
    "\n",
    "```bash\n",
    "pip install fate_client\n",
    "```\n",
    "\n",
    "To use Pipeline, we need to first specify which `FATE Flow Service` to connect to. Once `fate_client` installed, one can find an cmd enterpoint name `pipeline`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Usage: pipeline [OPTIONS] COMMAND [ARGS]...\n",
      "\n",
      "Options:\n",
      "  -h, --help  Show this message and exit.\n",
      "\n",
      "Commands:\n",
      "  init       pipeline init\n",
      "  show       - DESCRIPTION: Show pipeline config details for Flow server.\n",
      "  site-info  pipeline site info\n"
     ]
    }
   ],
   "source": [
    "!pipeline --help"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Assume we have a `FATE Flow Service` in 127.0.0.1:9380(defaults in standalone), then exec"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Pipeline configuration succeeded.\n"
     ]
    }
   ],
   "source": [
    "!pipeline init --ip 127.0.0.1 --port 9380"
   ]
  },
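  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before running `pipeline init`, it can be useful to confirm that something is listening at the chosen address. As a minimal sketch that is independent of `fate_client`, the helper below (the name `is_reachable` is hypothetical, introduced only for illustration) uses the Python standard library to check whether a TCP port accepts connections. It only tests reachability; it does not verify that the listener is actually a `FATE Flow Service`.\n",
    "\n",
    "```python\n",
    "import socket\n",
    "\n",
    "\n",
    "def is_reachable(ip, port, timeout=2.0):\n",
    "    # Return True if a TCP connection to (ip, port) can be opened.\n",
    "    try:\n",
    "        with socket.create_connection((ip, port), timeout=timeout):\n",
    "            return True\n",
    "    except OSError:\n",
    "        return False\n",
    "\n",
    "\n",
    "# Address and port taken from the `pipeline init` call above.\n",
    "print(is_reachable('127.0.0.1', 9380))\n",
    "```"
   ]
  },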
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Hetero Example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " Before start a modeling task, data to be used should be transformed into dataframe. Please refer to this [guide](./pipeline_tutorial_transform_local_file_to_dataframe.ipynb)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `pipeline` package provides components to compose a `FATE pipeline`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "from fate_client.pipeline.components.fate import CoordinatedLR, PSI, Evaluation, Reader\n",
    "from fate_client.pipeline import FateFlowPipeline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Make a `pipeline` instance:\n",
    "\n",
    "    - initiator: \n",
    "        * role: guest\n",
    "        * party: 9999\n",
    "    - roles:\n",
    "        * guest: 9999\n",
    "        * host: 10000\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "pipeline = FateFlowPipeline().set_parties(guest='9999', host='10000', arbiter='10000')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add `Reader` as first component, which reads in uploaded data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "reader_0 = Reader(\"reader_0\", runtime_parties=dict(guest='9999', host='10000'))\n",
    "reader_0.guest.task_parameters(namespace=\"experiment\", name=\"breast_hetero_guest\")\n",
    "reader_0.hosts[0].task_parameters(namespace=\"experiment\", name=\"breast_hetero_host\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add `PSI` component to perform PSI for hetero-scenario. Specify input data frame from `Reader`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "psi_0 = PSI(\"psi_0\", input_data=reader_0.outputs[\"output_data\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we add training component `Coordinated LR` and another LR component that predicts with model from previous component. Here we show how to feed output data and model from one component to another."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "lr_0 = CoordinatedLR(\"lr_0\", epochs=3,\n",
    "                     batch_size=None,\n",
    "                     optimizer={\"method\": \"rmsprop\", \"optimizer_params\": {\"lr\": 0.1}, \"penalty\": \"l2\", \"alpha\": 0.001},\n",
    "                     init_param={\"fit_intercept\": True, \"method\": \"random_uniform\", \"random_state\": 42},\n",
    "                     train_data=psi_0.outputs[\"output_data\"],\n",
    "                     learning_rate_scheduler={\"method\": \"linear\", \"scheduler_params\": {\"start_factor\": 0.7,\n",
    "                                                                                       \"total_iters\": 100}})\n",
    "            \n",
    "lr_1 = CoordinatedLR('lr_1', test_data=psi_0.outputs['output_data'], input_model=lr_0.outputs['output_model'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To show the evaluation result, an \"Evaluation\" component is needed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "evaluation_0 = Evaluation(\"evaluation_0\",\n",
    "                          runtime_parties=dict(guest='9999'),\n",
    "                          default_eval_setting=\"binary\",\n",
    "                          input_datas=lr_0.outputs[\"train_output_data\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add components to pipeline, in order of execution:\n",
    "\n",
    "    - `psi_0` is responsible for finding overlapping match id\n",
    "    - `lr_0` trains Coordinated LR on data output by `psi_0`\n",
    "    - `lr_1` predicts with model from `lr_0`\n",
    "    - `evaluation_0` consumes `lr_0`'s prediciton result on training data\n",
    "\n",
    "Then compile our pipeline to make it ready for submission."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "pipeline.add_tasks([reader_0, psi_0, lr_0, lr_1, evaluation_0])\n",
    "pipeline.compile();"
   ]
  },
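  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The execution order follows from how the components feed each other. As an illustration only (this is not a description of how FATE Flow schedules jobs internally, which this tutorial does not cover), the sketch below writes the wiring from the cells above as a plain dependency map and derives a valid order with the standard-library `graphlib` module (Python 3.9+). The `dependencies` dictionary is a hand-written assumption based on the `input_data`, `train_data`, `test_data`, `input_model` and `input_datas` arguments used above.\n",
    "\n",
    "```python\n",
    "from graphlib import TopologicalSorter\n",
    "\n",
    "# Each task maps to the set of tasks whose outputs it consumes,\n",
    "# following the component wiring in the cells above.\n",
    "dependencies = {\n",
    "    'reader_0': set(),\n",
    "    'psi_0': {'reader_0'},\n",
    "    'lr_0': {'psi_0'},\n",
    "    'lr_1': {'psi_0', 'lr_0'},\n",
    "    'evaluation_0': {'lr_0'},\n",
    "}\n",
    "\n",
    "# static_order() yields every task after all of its dependencies.\n",
    "order = list(TopologicalSorter(dependencies).static_order())\n",
    "print(order)\n",
    "```\n",
    "\n",
    "`lr_1` and `evaluation_0` both depend only on earlier tasks, so their relative position in the result is not fixed by the data flow."
   ]
  },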
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, submit(fit) our pipeline:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Job id is 202308311051324015890\n",
      "\n",
      "\u001b[80D\u001b[1A\u001b[KJob is waiting, time elapse: 0:00:00\n",
      "\u001b[80D\u001b[1A\u001b[KJob is waiting, time elapse: 0:00:01\n",
      "\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:02\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:03\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:04\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:05\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:06\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:07\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:08\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:09\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:10\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:11\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:12\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:13\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:14\n",
      "\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:15\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:16\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:17\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:18\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:19\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:20\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:21\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:22\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:23\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:24\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:25\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:26\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:27\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:28\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:29\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:30\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:31\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:32\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:33\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:34\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:35\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:36\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:37\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:38\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:39\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:40\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:41\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:42\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:43\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:44\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:45\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:46\n",
      "\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_1, time elapse: 0:00:47\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_1, time elapse: 0:00:48\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_1, time elapse: 0:00:49\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_1, time elapse: 0:00:50\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_1, time elapse: 0:00:51\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_1, time elapse: 0:00:52\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_1, time elapse: 0:00:53\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_1, time elapse: 0:00:54\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_1, time elapse: 0:00:55\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_1, time elapse: 0:00:56\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_1, time elapse: 0:00:57\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_1, time elapse: 0:00:58\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_1, time elapse: 0:01:00\n",
      "\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task evaluation_0, time elapse: 0:01:01\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task evaluation_0, time elapse: 0:01:02\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task evaluation_0, time elapse: 0:01:03\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task evaluation_0, time elapse: 0:01:04\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task evaluation_0, time elapse: 0:01:05\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task evaluation_0, time elapse: 0:01:06\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task evaluation_0, time elapse: 0:01:07\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task evaluation_0, time elapse: 0:01:08\n",
      "Job is success!!! Job id is 202308311051324015890, response_data={'apply_resource_time': 1693450294353, 'cores': 4, 'create_time': 1693450292411, 'dag': {'dag': {'conf': {'auto_retries': 0, 'computing_partitions': 8, 'cores': None, 'engine': None, 'inheritance': None, 'initiator_party_id': '9999', 'model_id': '202308311051324015890', 'model_version': '0', 'model_warehouse': None, 'priority': None, 'scheduler_party_id': '9999', 'sync_type': 'callback', 'task': None, 'task_cores': None}, 'parties': [{'party_id': ['9999'], 'role': 'guest'}, {'party_id': ['10000'], 'role': 'host'}, {'party_id': ['10000'], 'role': 'arbiter'}], 'party_tasks': {'guest_9999': {'conf': None, 'parties': [{'party_id': ['9999'], 'role': 'guest'}], 'tasks': {'psi_0': {'conf': None, 'inputs': {'data': {'input_data': {'data_warehouse': {'job_id': None, 'name': 'breast_hetero_guest', 'namespace': 'experiment', 'output_artifact_key': None, 'producer_task': None, 'roles': ['guest']}}}, 'model': None}, 'parameters': None}}}, 'host_10000': {'conf': None, 'parties': [{'party_id': ['10000'], 'role': 'host'}], 'tasks': {'psi_0': {'conf': None, 'inputs': {'data': {'input_data': {'data_warehouse': {'job_id': None, 'name': 'breast_hetero_host', 'namespace': 'experiment', 'output_artifact_key': None, 'producer_task': None, 'roles': ['host']}}}, 'model': None}, 'parameters': None}}}}, 'stage': 'train', 'tasks': {'evaluation_0': {'component_ref': 'evaluation', 'conf': None, 'dependent_tasks': ['lr_0'], 'inputs': {'data': {'input_data': {'task_output_artifact': [{'output_artifact_key': 'train_output_data', 'producer_task': 'lr_0', 'roles': ['guest']}]}}, 'model': None}, 'parameters': {'default_eval_setting': 'binary', 'label_column_name': None, 'metrics': None, 'predict_column_name': None}, 'parties': [{'party_id': ['9999'], 'role': 'guest'}], 'stage': 'default'}, 'lr_0': {'component_ref': 'coordinated_lr', 'conf': None, 'dependent_tasks': ['psi_0'], 'inputs': {'data': {'train_data': 
{'task_output_artifact': {'output_artifact_key': 'output_data', 'producer_task': 'psi_0', 'roles': ['guest', 'host']}}}, 'model': {}}, 'parameters': {'batch_size': None, 'early_stop': 'diff', 'epochs': 5, 'init_param': {'fit_intercept': True, 'method': 'zeros'}, 'learning_rate_scheduler': {'method': 'linear', 'scheduler_params': {'start_factor': 0.7, 'total_iters': 100}}, 'optimizer': {'alpha': 0.001, 'method': 'SGD', 'optimizer_params': {'lr': 0.1}, 'penalty': 'l2'}, 'output_cv_data': True, 'threshold': 0.5, 'tol': 0.0001}, 'parties': None, 'stage': None}, 'lr_1': {'component_ref': 'coordinated_lr', 'conf': None, 'dependent_tasks': ['lr_0', 'psi_0'], 'inputs': {'data': {'test_data': {'task_output_artifact': {'output_artifact_key': 'output_data', 'producer_task': 'psi_0', 'roles': ['guest', 'host']}}}, 'model': {'input_model': {'task_output_artifact': {'output_artifact_key': 'output_model', 'producer_task': 'lr_0', 'roles': ['guest', 'host']}}}}, 'parameters': {'batch_size': None, 'early_stop': 'diff', 'epochs': 20, 'output_cv_data': True, 'threshold': 0.5, 'tol': 0.0001}, 'parties': None, 'stage': 'predict'}, 'psi_0': {'component_ref': 'psi', 'conf': None, 'dependent_tasks': None, 'inputs': {'data': {}, 'model': None}, 'parameters': {}, 'parties': [{'party_id': ['9999'], 'role': 'guest'}, {'party_id': ['10000'], 'role': 'host'}], 'stage': 'default'}}}, 'schema_version': '2.0.0.alpha'}, 'description': '', 'elapsed': 66740, 'end_time': 1693450361127, 'engine_name': 'standalone', 'inheritance': {}, 'initiator_party_id': '9999', 'job_id': '202308311051324015890', 'memory': 0, 'model_id': '202308311051324015890', 'model_version': '0', 'parties': [{'party_id': ['9999'], 'role': 'guest'}, {'party_id': ['10000'], 'role': 'host'}, {'party_id': ['10000'], 'role': 'arbiter'}], 'party_id': '9999', 'progress': 100, 'remaining_cores': 4, 'remaining_memory': 0, 'resource_in_use': False, 'return_resource_time': 1693450361094, 'role': 'guest', 'scheduler_party_id': '9999', 
'start_time': 1693450294387, 'status': 'success', 'status_code': None, 'tag': 'job_end', 'update_time': 1693450361127, 'user_name': ''}\n",
      "Total time: 0:01:09\n"
     ]
    }
   ],
   "source": [
    "pipeline.fit()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once training is done, data and model output from trained components may be queried through pipeline api. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>extend_sid</th>\n",
       "      <th>id</th>\n",
       "      <th>label</th>\n",
       "      <th>predict_score</th>\n",
       "      <th>predict_result</th>\n",
       "      <th>predict_detail</th>\n",
       "      <th>type</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>a41979464da4e859ce5f594b3da915820</td>\n",
       "      <td>133</td>\n",
       "      <td>1</td>\n",
       "      <td>0.5453636377530179</td>\n",
       "      <td>1</td>\n",
       "      <td>{'0': 0.4546363622469821, '1': 0.5453636377530...</td>\n",
       "      <td>train_set</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>a41979464da4e859ce5f594b3da9158222</td>\n",
       "      <td>262</td>\n",
       "      <td>0</td>\n",
       "      <td>0.28589260037945036</td>\n",
       "      <td>0</td>\n",
       "      <td>{'0': 0.7141073996205496, '1': 0.2858926003794...</td>\n",
       "      <td>train_set</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>a41979464da4e859ce5f594b3da9158276</td>\n",
       "      <td>116</td>\n",
       "      <td>1</td>\n",
       "      <td>0.7589402080943449</td>\n",
       "      <td>1</td>\n",
       "      <td>{'0': 0.24105979190565507, '1': 0.758940208094...</td>\n",
       "      <td>train_set</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>a41979464da4e859ce5f594b3da91582115</td>\n",
       "      <td>140</td>\n",
       "      <td>1</td>\n",
       "      <td>0.837934821102845</td>\n",
       "      <td>1</td>\n",
       "      <td>{'0': 0.162065178897155, '1': 0.837934821102845}</td>\n",
       "      <td>train_set</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>a41979464da4e859ce5f594b3da91582160</td>\n",
       "      <td>174</td>\n",
       "      <td>1</td>\n",
       "      <td>0.819790248482875</td>\n",
       "      <td>1</td>\n",
       "      <td>{'0': 0.18020975151712504, '1': 0.819790248482...</td>\n",
       "      <td>train_set</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                            extend_sid   id label        predict_score  \\\n",
       "0    a41979464da4e859ce5f594b3da915820  133     1   0.5453636377530179   \n",
       "1   a41979464da4e859ce5f594b3da9158222  262     0  0.28589260037945036   \n",
       "2   a41979464da4e859ce5f594b3da9158276  116     1   0.7589402080943449   \n",
       "3  a41979464da4e859ce5f594b3da91582115  140     1    0.837934821102845   \n",
       "4  a41979464da4e859ce5f594b3da91582160  174     1    0.819790248482875   \n",
       "\n",
       "  predict_result                                     predict_detail       type  \n",
       "0              1  {'0': 0.4546363622469821, '1': 0.5453636377530...  train_set  \n",
       "1              0  {'0': 0.7141073996205496, '1': 0.2858926003794...  train_set  \n",
       "2              1  {'0': 0.24105979190565507, '1': 0.758940208094...  train_set  \n",
       "3              1   {'0': 0.162065178897155, '1': 0.837934821102845}  train_set  \n",
       "4              1  {'0': 0.18020975151712504, '1': 0.819790248482...  train_set  "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sbt_0_data = pipeline.get_task_info(\"sbt_0\").get_output_data()[\"train_output_data\"]\n",
    "import pandas as pd\n",
    "pd.DataFrame(sbt_0_data).head()"
   ]
  },
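  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For binary classification, the output columns shown above are related in a simple way: `predict_result` is the class obtained by comparing `predict_score` with the threshold, and `predict_detail` holds the two complementary class probabilities. The sketch below reproduces this for the five displayed rows in plain Python. The `classify` helper is hypothetical, and the strict comparison at the threshold is an assumption rather than a statement about FATE's implementation.\n",
    "\n",
    "```python\n",
    "# The five rows displayed above, as (label, predict_score) pairs.\n",
    "rows = [\n",
    "    (1, 0.5453636377530179),\n",
    "    (0, 0.28589260037945036),\n",
    "    (1, 0.7589402080943449),\n",
    "    (1, 0.837934821102845),\n",
    "    (1, 0.819790248482875),\n",
    "]\n",
    "threshold = 0.5  # the `threshold` value reported in the job parameters\n",
    "\n",
    "\n",
    "def classify(score, threshold=0.5):\n",
    "    # Assumption: a score above the threshold maps to class 1.\n",
    "    predict_result = 1 if score > threshold else 0\n",
    "    predict_detail = {'0': 1.0 - score, '1': score}\n",
    "    return predict_result, predict_detail\n",
    "\n",
    "\n",
    "results = [classify(score, threshold)[0] for _, score in rows]\n",
    "# Fraction of the five rows whose predicted class equals the label.\n",
    "accuracy = sum(r == label for r, (label, _) in zip(results, rows)) / len(rows)\n",
    "print(results, accuracy)\n",
    "```"
   ]
  },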
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'output_model': {'data': {'estimator': {'end_epoch': 5,\n",
       "    'fit_intercept': True,\n",
       "    'is_converged': False,\n",
       "    'lr_scheduler': {'lr_params': {'start_factor': 0.7, 'total_iters': 100},\n",
       "     'lr_scheduler': {'_get_lr_called_within_step': False,\n",
       "      '_last_lr': [0.07119999999999999],\n",
       "      '_step_count': 5,\n",
       "      'base_lrs': [0.1],\n",
       "      'end_factor': 1.0,\n",
       "      'last_epoch': 4,\n",
       "      'start_factor': 0.7,\n",
       "      'total_iters': 100,\n",
       "      'verbose': False},\n",
       "     'method': 'linear'},\n",
       "    'optimizer': {'alpha': 0.001,\n",
       "     'l1_penalty': False,\n",
       "     'l2_penalty': True,\n",
       "     'method': 'sgd',\n",
       "     'model_parameter': [[0.0],\n",
       "      [0.0],\n",
       "      [0.0],\n",
       "      [0.0],\n",
       "      [0.0],\n",
       "      [0.0],\n",
       "      [0.0],\n",
       "      [0.0],\n",
       "      [0.0],\n",
       "      [0.0],\n",
       "      [0.0]],\n",
       "     'model_parameter_dtype': 'float32',\n",
       "     'optim_param': {'lr': 0.1},\n",
       "     'optimizer': {'param_groups': [{'dampening': 0,\n",
       "        'differentiable': False,\n",
       "        'foreach': None,\n",
       "        'initial_lr': 0.1,\n",
       "        'lr': 0.07119999999999999,\n",
       "        'maximize': False,\n",
       "        'momentum': 0,\n",
       "        'nesterov': False,\n",
       "        'params': [0],\n",
       "        'weight_decay': 0}],\n",
       "      'state': {}}},\n",
       "    'param': {'coef_': [[-0.0878903686629256],\n",
       "      [-0.05677242358584973],\n",
       "      [-0.08771869885341368],\n",
       "      [-0.08136158941522312],\n",
       "      [-0.04950030091279235],\n",
       "      [-0.06369604907729508],\n",
       "      [-0.07172871180928618],\n",
       "      [-0.08904661502230068],\n",
       "      [-0.04913537990226004],\n",
       "      [-0.03418310333218406]],\n",
       "     'dtype': 'float64',\n",
       "     'intercept_': [0.04341809752512136]}}},\n",
       "  'meta': {'batch_size': None,\n",
       "   'epochs': 5,\n",
       "   'init_param': {'fill_val': 0.0,\n",
       "    'fit_intercept': True,\n",
       "    'method': 'zeros',\n",
       "    'random_state': None},\n",
       "   'labels': [0, 1],\n",
       "   'learning_rate_param': {'method': 'linear',\n",
       "    'scheduler_params': {'start_factor': 0.7, 'total_iters': 100}},\n",
       "   'optimizer_param': {'alpha': 0.001,\n",
       "    'method': 'sgd',\n",
       "    'optimizer_params': {'lr': 0.1},\n",
       "    'penalty': 'l2'},\n",
       "   'ovr': False,\n",
       "   'threshold': 0.5}}}"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lr_0_model = pipeline.get_task_info(\"lr_0\").get_output_model()\n",
    "lr_0_model"
   ]
  },
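  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `_last_lr` value in the model output can be checked by hand. The sketch below assumes that the `linear` scheduler follows the usual linear schedule, in which the multiplier moves from `start_factor` to `end_factor` over `total_iters` steps; the state keys in the output resemble PyTorch's `LinearLR`, but that correspondence is an assumption. The function name `linear_lr` is hypothetical.\n",
    "\n",
    "```python\n",
    "def linear_lr(base_lr, start_factor, end_factor, total_iters, epoch):\n",
    "    # Closed form of a linear schedule: the factor moves from start_factor\n",
    "    # to end_factor over total_iters steps, then stays at end_factor.\n",
    "    progress = min(epoch, total_iters) / total_iters\n",
    "    return base_lr * (start_factor + (end_factor - start_factor) * progress)\n",
    "\n",
    "\n",
    "# Values taken from the `lr_scheduler` entry in the model output above.\n",
    "lr = linear_lr(base_lr=0.1, start_factor=0.7, end_factor=1.0,\n",
    "               total_iters=100, epoch=4)\n",
    "print(round(lr, 6))  # prints 0.0712\n",
    "```\n",
    "\n",
    "With `last_epoch` equal to 4, this gives 0.1 * (0.7 + 0.3 * 4 / 100) = 0.0712, which agrees with the reported `_last_lr` up to floating-point rounding."
   ]
  },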
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To run prediction, trained components should first be deployed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "pipeline.deploy([psi_0, lr_0]);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then, get deployed pipeline."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "deployed_pipeline = pipeline.get_deployed_pipeline()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Specify data input for predict pipeline."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "reader_1 = Reader(\"reader_1\", runtime_parties=dict(guest=guest, host=host))\n",
    "reader_1.guest.task_parameters(namespace=f\"experiment\", name=\"breast_hetero_guest\")\n",
    "reader_1.hosts[0].task_parameters(namespace=f\"experiment\", name=\"breast_hetero_host\")\n",
    "deployed_pipeline.psi_0.input_data = reader_1.outputs[\"output_data\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add components to predict pipeline in order of execution:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "predict_pipeline = FateFlowPipeline()\n",
    "predict_pipeline.add_tasks([reader_1, deployed_pipeline])\n",
    "predict_pipeline.compile();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then, run prediction job"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Job id is 202308311054193818250\n",
      "\n",
      "\u001b[80D\u001b[1A\u001b[KJob is waiting, time elapse: 0:00:00\n",
      "\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:01\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:02\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:03\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:04\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:05\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:06\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:07\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:08\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:09\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:10\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:11\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task psi_0, time elapse: 0:00:12\n",
      "\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:13\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:14\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:15\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:16\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:17\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:18\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:19\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:20\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:21\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:22\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:23\n",
      "\u001b[80D\u001b[1A\u001b[KRunning task lr_0, time elapse: 0:00:24\n",
      "Job is success!!! Job id is 202308311054193818250, response_data={'apply_resource_time': 1693450459527, 'cores': 4, 'create_time': 1693450459392, 'dag': {'dag': {'conf': {'auto_retries': 0, 'computing_partitions': 8, 'cores': None, 'engine': None, 'inheritance': None, 'initiator_party_id': '9999', 'model_id': '202308311054193818250', 'model_version': '0', 'model_warehouse': {'model_id': '202308311051324015890', 'model_version': '0'}, 'priority': None, 'scheduler_party_id': '9999', 'sync_type': 'callback', 'task': None, 'task_cores': None}, 'parties': [{'party_id': ['9999'], 'role': 'guest'}, {'party_id': ['10000'], 'role': 'host'}, {'party_id': ['10000'], 'role': 'arbiter'}], 'party_tasks': {'guest_9999': {'conf': None, 'parties': [{'party_id': ['9999'], 'role': 'guest'}], 'tasks': {'psi_0': {'conf': None, 'inputs': {'data': {'input_data': {'data_warehouse': {'job_id': None, 'name': 'breast_hetero_guest', 'namespace': 'experiment', 'output_artifact_key': None, 'producer_task': None, 'roles': ['guest']}}}, 'model': None}, 'parameters': None}}}, 'host_10000': {'conf': None, 'parties': [{'party_id': ['10000'], 'role': 'host'}], 'tasks': {'psi_0': {'conf': None, 'inputs': {'data': {'input_data': {'data_warehouse': {'job_id': None, 'name': 'breast_hetero_host', 'namespace': 'experiment', 'output_artifact_key': None, 'producer_task': None, 'roles': ['host']}}}, 'model': None}, 'parameters': None}}}}, 'stage': 'predict', 'tasks': {'lr_0': {'component_ref': 'coordinated_lr', 'conf': None, 'dependent_tasks': ['psi_0'], 'inputs': {'data': {'test_data': {'task_output_artifact': {'output_artifact_key': 'output_data', 'producer_task': 'psi_0', 'roles': ['guest', 'host']}}}, 'model': {'input_model': {'model_warehouse': {'output_artifact_key': 'output_model', 'producer_task': 'lr_0', 'roles': ['guest', 'host']}}}}, 'parameters': {'batch_size': None, 'early_stop': 'diff', 'epochs': 5, 'init_param': {'fit_intercept': True, 'method': 'zeros'}, 'learning_rate_scheduler': 
{'method': 'linear', 'scheduler_params': {'start_factor': 0.7, 'total_iters': 100}}, 'optimizer': {'alpha': 0.001, 'method': 'SGD', 'optimizer_params': {'lr': 0.1}, 'penalty': 'l2'}, 'output_cv_data': True, 'threshold': 0.5, 'tol': 0.0001}, 'parties': None, 'stage': None}, 'psi_0': {'component_ref': 'psi', 'conf': None, 'dependent_tasks': None, 'inputs': {'data': {}, 'model': None}, 'parameters': {}, 'parties': [{'party_id': ['9999'], 'role': 'guest'}, {'party_id': ['10000'], 'role': 'host'}], 'stage': 'default'}}}, 'schema_version': '2.0.0.alpha'}, 'description': '', 'elapsed': 25170, 'end_time': 1693450484709, 'engine_name': 'standalone', 'inheritance': {}, 'initiator_party_id': '9999', 'job_id': '202308311054193818250', 'memory': 0, 'model_id': '202308311054193818250', 'model_version': '0', 'parties': [{'party_id': ['9999'], 'role': 'guest'}, {'party_id': ['10000'], 'role': 'host'}, {'party_id': ['10000'], 'role': 'arbiter'}], 'party_id': '9999', 'progress': 100, 'remaining_cores': 4, 'remaining_memory': 0, 'resource_in_use': False, 'return_resource_time': 1693450484675, 'role': 'guest', 'scheduler_party_id': '9999', 'start_time': 1693450459539, 'status': 'success', 'status_code': None, 'tag': 'job_end', 'update_time': 1693450484709, 'user_name': ''}\n",
      "Total time: 0:00:25\n"
     ]
    }
   ],
   "source": [
    "predict_pipeline.predict();"
   ]
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "ad4309918fa4cd1705b305e369b2f64d901b1851e9144aef7b9b07ea3efcb1bb"
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
